Deep learning (DL) based downscaling has become a popular tool in the earth sciences. An increasing number of DL approaches are being adopted to downscale coarse precipitation data and produce more accurate and reliable estimates at local (~few km or even smaller) scales. Although several studies have adopted dynamical or statistical downscaling of precipitation, their accuracy is limited by the availability of ground truth. A key challenge in measuring the accuracy of such methods is comparing the downscaled data with point-scale observations, which are often unavailable at such small scales. In this work, we carry out DL-based downscaling to estimate the local precipitation data of the India Meteorological Department (IMD), which was created by approximating the values from station locations to grid points. To test the efficacy of different DL approaches, we employ four downscaling methods and evaluate their performance: (i) Deep Statistical Downscaling (DeepSD), (ii) an augmented Convolutional Long Short-Term Memory (ConvLSTM), (iii) a fully convolutional network (U-NET), and (iv) a Super-Resolution Generative Adversarial Network (SR-GAN). The custom VGG network used in the SR-GAN is developed in this work using precipitation data. The results indicate that SR-GAN is the best method for precipitation data downscaling. The downscaled data are validated with precipitation values at IMD stations. This DL-based approach offers a promising alternative to statistical downscaling.
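As a rough illustration of the super-resolution framing used above, the hedged sketch below shows an SRGAN-style generator that maps a coarse precipitation grid to a 4x finer grid. The layer widths, scale factor, and use of PyTorch are assumptions for illustration; this is not the paper's architecture or its custom VGG backbone.

```python
# Minimal sketch (not the authors' exact architecture): a small SRGAN-style
# generator that upsamples coarse precipitation grids by a factor of 4.
import torch
import torch.nn as nn

class SRGenerator(nn.Module):
    def __init__(self, channels=1, features=64, upscale=4):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(channels, features, 9, padding=4), nn.PReLU())
        self.body = nn.Sequential(
            nn.Conv2d(features, features, 3, padding=1), nn.PReLU(),
            nn.Conv2d(features, features, 3, padding=1), nn.PReLU(),
        )
        # Pixel-shuffle upsampling: two 2x stages give a total 4x refinement.
        ups = []
        for _ in range(upscale // 2):
            ups += [nn.Conv2d(features, features * 4, 3, padding=1),
                    nn.PixelShuffle(2), nn.PReLU()]
        self.upsample = nn.Sequential(*ups)
        self.tail = nn.Conv2d(features, channels, 9, padding=4)

    def forward(self, coarse_precip):
        x = self.head(coarse_precip)
        x = x + self.body(x)                 # residual connection
        x = self.upsample(x)
        return torch.relu(self.tail(x))      # precipitation is non-negative

# Example: a 32x32 coarse grid becomes a 128x128 fine grid.
fine = SRGenerator()(torch.rand(1, 1, 32, 32))
print(fine.shape)  # torch.Size([1, 1, 128, 128])
```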
A comprehensive pharmaceutical recommendation system was designed based on patient and drug features extracted from Drugs.com and Druglib.com. First, data from these databases were combined, and a dataset of patient and drug information was built. Second, the patients and drugs were clustered, and the recommendation was then performed using the different ratings provided by patients and, importantly, the knowledge obtained from patient and drug specifications, while taking drug interactions into account. To the best of our knowledge, we are the first group to consider patients' conditions and history in an approach for selecting a specific medicine appropriate for that particular user. Our approach applies artificial intelligence (AI) models for the implementation. Sentiment analysis using natural language processing is employed in pre-processing, along with neural network-based methods and recommender system algorithms for modeling the system. In our work, patients' conditions and drugs' features are used to build two models based on matrix factorization. We then use drug interactions to filter out drugs with severe or mild interactions with other drugs. We developed a deep learning model for recommending drugs using data from 2304 patients as a training set and data from 660 patients as a validation set. Finally, we used knowledge of critical information about drugs and combined the model's output into a knowledge-based system with rules derived from constraints on taking medicine.
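A minimal sketch of the matrix-factorization plus interaction-filtering idea described above, under stated assumptions: the rating matrix, latent dimension, interaction table, and function names are hypothetical, and the paper's sentiment analysis and knowledge-based rules are not reproduced.

```python
# Illustrative sketch only (not the paper's implementation): matrix
# factorization on a patient-drug rating matrix, then filtering out candidate
# drugs with a known severe interaction. All names and data are hypothetical.
import numpy as np

def factorize(ratings, k=8, lr=0.01, reg=0.05, epochs=200, seed=0):
    """Factor ratings (patients x drugs, 0 = unknown) into P @ Q.T via SGD."""
    rng = np.random.default_rng(seed)
    n_pat, n_drug = ratings.shape
    P = 0.1 * rng.standard_normal((n_pat, k))
    Q = 0.1 * rng.standard_normal((n_drug, k))
    rows, cols = np.nonzero(ratings)
    for _ in range(epochs):
        for u, i in zip(rows, cols):
            err = ratings[u, i] - P[u] @ Q[i]
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

def recommend(P, Q, patient, current_drugs, severe_interactions, top_n=3):
    """Rank unseen drugs for one patient, excluding severe interactions."""
    scores = P[patient] @ Q.T
    banned = {d for c in current_drugs for d in severe_interactions.get(c, ())}
    ranked = [d for d in np.argsort(-scores)
              if d not in current_drugs and d not in banned]
    return ranked[:top_n]

ratings = np.array([[5, 0, 3, 0],
                    [4, 2, 0, 1],
                    [0, 5, 4, 0]], dtype=float)
P, Q = factorize(ratings)
# Drug 3 severely interacts with drug 0, so it is filtered out for this patient.
print(recommend(P, Q, patient=0, current_drugs={0}, severe_interactions={0: {3}}))
```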
We introduce camouflaged data poisoning attacks, a new attack vector that arises in the context of machine unlearning and other settings where model retraining may be induced. An adversary first adds a few carefully crafted points to the training dataset such that the impact on the model's predictions is minimal. The adversary subsequently triggers a request to remove a subset of the introduced points, at which point the attack is unleashed and the model's predictions are negatively affected. In particular, we consider clean-label targeted attacks (in which the goal is to cause the model to misclassify a specific test point) on datasets including CIFAR-10, Imagenette, and Imagewoof. This attack is realized by constructing camouflage datapoints that mask the effect of a poisoned dataset.
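To make the two-phase mechanics concrete, here is a runnable toy on a 1-nearest-neighbor classifier. It is not the paper's clean-label construction (the toy flips a label purely so the camouflage/unlearning effect is visible in a few lines); all points and labels are fabricated for illustration.

```python
import numpy as np

def nn_predict(X, y, query):
    """1-nearest-neighbor prediction."""
    return y[np.argmin(np.linalg.norm(X - query, axis=1))]

# Small clean training set: a class-0 cluster and a class-1 cluster.
clean_X = np.array([[0.0, 0.0], [0.5, 0.2], [-0.3, 0.4],
                    [5.0, 5.0], [5.2, 4.8], [4.7, 5.1]])
clean_y = np.array([0, 0, 0, 1, 1, 1])
target = np.array([2.0, 2.0])              # test point the adversary wants misclassified as 1

poison_X, poison_y = np.array([[2.2, 2.2]]), np.array([1])   # would flip the target if alone
camo_X, camo_y = np.array([[2.1, 2.1]]), np.array([0])       # closer to the target; masks the poison

# Phase 1: both poison and camouflage are present -> the model looks benign.
X1, y1 = np.vstack([clean_X, poison_X, camo_X]), np.concatenate([clean_y, poison_y, camo_y])
print("before unlearning:", nn_predict(X1, y1, target))      # 0 (correct)

# Phase 2: the adversary requests removal of only the camouflage points;
# retraining on the remaining data unleashes the attack.
X2, y2 = np.vstack([clean_X, poison_X]), np.concatenate([clean_y, poison_y])
print("after unlearning: ", nn_predict(X2, y2, target))      # 1 (misclassified)
```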
Individual-level data (microdata) that characterizes a population is essential for studying many real-world problems. However, acquiring such data is not straightforward due to cost and privacy constraints, and access is often limited to aggregated data (macro data) sources. In this study, we examine synthetic data generation as a tool to extrapolate difficult-to-obtain high-resolution data by combining information from multiple easier-to-obtain lower-resolution data sources. In particular, we introduce a framework that uses a combination of univariate and multivariate frequency tables from a given target geographical location in combination with frequency tables from other auxiliary locations to generate synthetic microdata for individuals in the target location. Our method combines the estimation of a dependency graph and conditional probabilities from the target location with the use of a Gaussian copula to leverage the available information from the auxiliary locations. We perform extensive testing on two real-world datasets and demonstrate that our approach outperforms prior approaches in preserving the overall dependency structure of the data while also satisfying the constraints defined on the different variables.
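A minimal sketch of the Gaussian-copula component under simplifying assumptions: the marginal frequency tables come from the target location, a latent correlation matrix stands in for the dependence information borrowed from auxiliary locations, and the dependency-graph and conditional-probability estimation steps of the full framework are omitted.

```python
# Draw synthetic categorical records whose marginals match the target location
# and whose dependence structure is borrowed, via a Gaussian copula, from an
# auxiliary location. Variable names and numbers are hypothetical.
import numpy as np
from scipy.stats import norm

def copula_synthesize(target_marginals, aux_corr, n, seed=0):
    """target_marginals: list of 1D probability vectors, one per variable.
    aux_corr: correlation matrix of the latent Gaussian, estimated elsewhere."""
    rng = np.random.default_rng(seed)
    d = len(target_marginals)
    z = rng.multivariate_normal(np.zeros(d), aux_corr, size=n)   # correlated latents
    u = norm.cdf(z)                                              # uniform marginals
    records = np.empty((n, d), dtype=int)
    for j, p in enumerate(target_marginals):
        cum = np.cumsum(p)
        records[:, j] = np.searchsorted(cum, u[:, j])            # invert each categorical CDF
    return records

marginals = [np.array([0.6, 0.4]),            # e.g. a binary employment status
             np.array([0.2, 0.5, 0.3])]       # e.g. three income brackets
aux_corr = np.array([[1.0, 0.7],
                     [0.7, 1.0]])             # dependence borrowed from an auxiliary location
print(copula_synthesize(marginals, aux_corr, n=5))
```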
Modern deep learning models are over-parameterized, where the optimization setup strongly affects the generalization performance. A key element of reliable optimization for these systems is the modification of the loss function. Sharpness-Aware Minimization (SAM) modifies the underlying loss function to guide descent methods towards flatter minima, which arguably have better generalization abilities. In this paper, we focus on a variant of SAM known as mSAM, which, during training, averages the updates generated by adversarial perturbations across several disjoint shards of a mini-batch. Recent work suggests that mSAM can outperform SAM in terms of test accuracy. However, a comprehensive empirical study of mSAM is missing from the literature -- previous results have mostly been limited to specific architectures and datasets. To that end, this paper presents a thorough empirical evaluation of mSAM on various tasks and datasets. We provide a flexible implementation of mSAM and compare the generalization performance of mSAM to the performance of SAM and vanilla training on different image classification and natural language processing tasks. We also conduct careful experiments to understand the computational cost of training with mSAM, its sensitivity to hyperparameters and its correlation with the flatness of the loss landscape. Our analysis reveals that mSAM yields superior generalization performance and flatter minima, compared to SAM, across a wide range of tasks without significantly increasing computational costs.
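A hedged sketch of one mSAM update as described above: the mini-batch is split into m disjoint shards, each shard gets its own SAM-style adversarial perturbation, and the gradients taken at the perturbed points are averaged before a single optimizer step. The PyTorch structure, rho value, and shard count are illustrative, not the paper's implementation.

```python
import torch

def msam_step(model, loss_fn, inputs, targets, base_opt, rho=0.05, m=4):
    shard_x, shard_y = inputs.chunk(m), targets.chunk(m)
    avg_grads = [torch.zeros_like(p) for p in model.parameters()]

    for x, y in zip(shard_x, shard_y):
        # 1) gradient of this shard's loss at the current weights
        base_opt.zero_grad()
        loss_fn(model(x), y).backward()
        grads = [p.grad.detach().clone() for p in model.parameters()]
        norm = torch.norm(torch.stack([g.norm() for g in grads]))

        # 2) ascend to the shard-specific adversarial point w + rho * g / ||g||
        eps = [rho * g / (norm + 1e-12) for g in grads]
        with torch.no_grad():
            for p, e in zip(model.parameters(), eps):
                p.add_(e)

        # 3) gradient at the perturbed point, accumulated into the average
        base_opt.zero_grad()
        loss_fn(model(x), y).backward()
        with torch.no_grad():
            for p, e, a in zip(model.parameters(), eps, avg_grads):
                a.add_(p.grad / m)
                p.sub_(e)                      # restore the original weights

    # 4) single descent step with the shard-averaged gradient
    for p, a in zip(model.parameters(), avg_grads):
        p.grad = a
    base_opt.step()

# Usage on a tiny model and random data.
model = torch.nn.Linear(10, 3)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 10), torch.randint(0, 3, (32,))
msam_step(model, torch.nn.functional.cross_entropy, x, y, opt)
```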
We study discrete distribution estimation under user-level local differential privacy (LDP). In user-level $\varepsilon$-LDP, each user has $m\ge1$ samples and the privacy of all $m$ samples must be preserved simultaneously. We resolve the following dilemma: While on the one hand having more samples per user should provide more information about the underlying distribution, on the other hand, guaranteeing the privacy of all $m$ samples should make the estimation task more difficult. We obtain tight bounds for this problem under almost all parameter regimes. Perhaps surprisingly, we show that in suitable parameter regimes, having $m$ samples per user is equivalent to having $m$ times more users, each with only one sample. Our results demonstrate interesting phase transitions for $m$ and the privacy parameter $\varepsilon$ in the estimation risk. Finally, connecting with recent results on shuffled DP, we show that combined with random shuffling, our algorithm leads to optimal error guarantees (up to logarithmic factors) under the central model of user-level DP in certain parameter regimes. We provide several simulations to verify our theoretical findings.
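For orientation only, the sketch below implements the classical single-sample (m = 1) baseline: k-ary randomized response with a debiasing step. The paper's user-level mechanisms for m > 1 samples per user and the shuffled-DP reduction are not reproduced here.

```python
import numpy as np

def k_rr_report(x, k, eps, rng):
    """Each user releases one epsilon-LDP report of a value in {0, ..., k-1}."""
    p_keep = np.exp(eps) / (np.exp(eps) + k - 1)
    if rng.random() < p_keep:
        return x
    return rng.choice([v for v in range(k) if v != x])

def estimate_distribution(reports, k, eps):
    """Debias the empirical frequencies of the noisy reports."""
    p = np.exp(eps) / (np.exp(eps) + k - 1)
    q = 1.0 / (np.exp(eps) + k - 1)
    freq = np.bincount(reports, minlength=k) / len(reports)
    return np.clip((freq - q) / (p - q), 0, None)

rng = np.random.default_rng(0)
k, eps, n_users = 4, 1.0, 20000
true_dist = np.array([0.5, 0.25, 0.15, 0.1])
samples = rng.choice(k, size=n_users, p=true_dist)
reports = np.array([k_rr_report(x, k, eps, rng) for x in samples])
print(estimate_distribution(reports, k, eps))   # approaches true_dist as n grows
```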
Recent advances in remote patient monitoring (RPM) systems allow various human activities to be recognized in order to measure vital signs, including subtle movements of superficial blood vessels. There is growing interest in applying artificial intelligence (AI) to this area of healthcare by addressing known limitations and challenges, such as the prediction and classification of vital signs and physical movements, which are regarded as crucial tasks. Federated learning is a relatively new AI technique designed to enhance data privacy by decentralizing traditional machine learning modeling. However, traditional federated learning requires training identical architectural models on the local clients and the global server, which restricts the global model architecture due to the lack of local model heterogeneity. To overcome this, a novel federated learning architecture, FedStack, is proposed in this study, which supports ensembling heterogeneous architectural client models. This work provides a privacy-preserving system for hospitalized in-patients in a decentralized manner and identifies optimal sensor placement. The proposed architecture is applied to a mobile health sensor benchmark dataset from 10 different subjects to classify 12 routine activities. Three AI models, ANN, CNN, and Bi-LSTM, are trained on individual subject data. The federated learning architecture is applied to these models to build local and global models capable of state-of-the-art performance. The local CNN model outperforms the ANN and Bi-LSTM models on each subject's data. Our proposed work shows that heterogeneous stacking of local models performs better than homogeneous stacking. This work lays the foundation for building an enhanced RPM system that incorporates client privacy to assist with clinical observation of patients in acute mental health facilities, ultimately helping to prevent unexpected deaths.
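An illustrative sketch of heterogeneous stacking in a federated setting, loosely following the FedStack idea described above: each client trains its own model architecture locally and shares only predictions, which the server stacks into a global meta-model. The scikit-learn models, probe set, and synthetic data are placeholders, not the paper's ANN/CNN/Bi-LSTM clients or its sensor dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def make_client_data(n=300, d=12, classes=3):
    X = rng.standard_normal((n, d))
    y = X[:, :classes].argmax(axis=1)          # synthetic activity labels
    return X, y

# Heterogeneous local models: each client keeps its own architecture and data.
clients = [
    (LogisticRegression(max_iter=500), make_client_data()),
    (MLPClassifier(hidden_layer_sizes=(32,), max_iter=800), make_client_data()),
    (DecisionTreeClassifier(max_depth=6), make_client_data()),
]
for model, (X, y) in clients:
    model.fit(X, y)                            # local training; raw data never leaves the client

# Server-side stacking: clients send only class-probability predictions on a
# shared probe set, which become the meta-features of the global model.
X_probe, y_probe = make_client_data(n=200)
meta_features = np.hstack([m.predict_proba(X_probe) for m, _ in clients])
global_model = LogisticRegression(max_iter=500).fit(meta_features, y_probe)
print("stacked global accuracy:", global_model.score(meta_features, y_probe))
```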
The immune response is a dynamic process by which the body determines whether an antigen is self or non-self. The state of this dynamic process is defined by the relative balance and populations of the inflammatory and regulatory actors that comprise this decision-making process. The goal of immunotherapy, for example in rheumatoid arthritis (RA), is to bias the immune state toward the regulatory actors and thereby shut off the autoimmune pathways in the response. Although several immunotherapy approaches are known, the effectiveness of a treatment depends on how the intervention alters the evolution of this state. Unfortunately, this depends not only on the dynamics of the process but also on the state of the system at the time of intervention -- a state that is difficult, if not impossible, to determine before a treatment is applied.
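As a toy illustration of why the state at intervention time matters (assumed dynamics, not a model taken from this work): a bistable one-dimensional system in which the same therapeutic push toward the regulatory side either switches the response off or fails, depending on when it is applied.

```python
# x is the degree of inflammatory dominance, with stable states x = 0
# (regulatory/tolerant) and x = 1 (autoimmune) separated by a threshold theta.
def simulate(x0, push_time, push_size, theta=0.4, dt=0.01, steps=4000):
    x = x0
    for step in range(steps):
        if step == push_time:
            x = max(x - push_size, 0.0)           # immunotherapy: shift toward regulation
        x += dt * x * (1 - x) * (x - theta)       # assumed bistable dynamics
    return x

# Identical treatment, applied early vs. late while inflammation is still growing.
print(simulate(x0=0.5, push_time=100, push_size=0.2))   # ends near 0: response switched off
print(simulate(x0=0.5, push_time=3000, push_size=0.2))  # ends near 1: intervention too late
```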
Precipitation governs Earth's climate, and its daily spatiotemporal fluctuations have major socioeconomic impacts. Advances in numerical weather prediction (NWP) have been measured by improvements in forecasts of various physical fields such as temperature and pressure; however, large biases remain in precipitation prediction. We augment the output of the well-known NWP model CFSv2 with deep learning to create a hybrid model that improves short-range global precipitation forecasts at 1-, 2-, and 3-day lead times. For the hybridization, we address the sphericity of the global data by transforming all fields to a cubed-sphere projection through a modified DLWP-CS architecture. The dynamical model's precipitation and surface temperature outputs are fed into the modified DLWP-CS (UNET) to predict the ground-truth precipitation. While the average bias of CFSv2 over land is +5 to +7 mm/day, the multivariate deep learning model reduces it to -1 to +1 mm/day. Hurricane Katrina in 2005, Hurricane Ivan, the China floods in 2010, the India floods in 2005, and Myanmar storm Nargis in 2008 are used to confirm the markedly improved skill of the hybrid dynamical-deep-learning model. CFSv2 typically shows moderate to large biases in the spatial pattern and overestimates precipitation at short-range time scales. The proposed deep-learning-augmented NWP model can address these biases and substantially improve the spatial pattern and magnitude of the predicted precipitation. Compared with CFSv2, the deep-learning-augmented CFSv2 reduces the average bias over important land regions at a 1-day lead time. The spatiotemporal deep learning system opens pathways to further improve the precision and accuracy of global short-range precipitation forecasts.
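A schematic sketch of the hybrid correction step, under stated assumptions: a plain 2D U-Net-style block on a flat grid stands in for the modified DLWP-CS on cubed-sphere projections, with the dynamical model's precipitation and surface temperature as the two input channels; layer widths and grid size are illustrative.

```python
import torch
import torch.nn as nn

class HybridCorrector(nn.Module):
    def __init__(self, in_ch=2, base=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, base, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(base, base, 3, padding=1), nn.ReLU())
        self.down = nn.MaxPool2d(2)
        self.mid = nn.Sequential(nn.Conv2d(base, base * 2, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(base * 2, base, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(base, 1, 1))

    def forward(self, x):
        e = self.enc(x)                        # skip-connection source
        m = self.mid(self.down(e))
        d = self.up(m)
        out = self.dec(torch.cat([d, e], dim=1))
        return torch.relu(out)                 # corrected, non-negative precipitation

# x: batch of [model precipitation, surface temperature] fields on a 64x64 grid.
x = torch.rand(4, 2, 64, 64)
print(HybridCorrector()(x).shape)              # torch.Size([4, 1, 64, 64])
```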
For safe and reliable deployment in the real world, autonomous agents must elicit appropriate levels of trust from human users. One method of building trust is to have agents assess and communicate their own competence for performing given tasks. Competence depends on the uncertainties affecting the agent, making accurate uncertainty quantification vital for competency assessment. In this work, we show how ensembles of deep generative models can be used to quantify an agent's aleatoric and epistemic uncertainties when forecasting task outcomes as part of competency assessment.
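A minimal sketch of the ensemble-based uncertainty split implied above, using a generic variance decomposition: the average of the members' predicted variances approximates aleatoric uncertainty, and the spread of the members' predicted means approximates epistemic uncertainty. The placeholder Gaussian predictors stand in for the paper's deep generative models.

```python
import numpy as np

def decompose_uncertainty(member_means, member_vars):
    """member_means/vars: (n_members, n_outcomes) predicted outcome statistics."""
    aleatoric = member_vars.mean(axis=0)           # average noise each member expects
    epistemic = member_means.var(axis=0)           # disagreement between members
    return aleatoric, epistemic, aleatoric + epistemic

# Toy ensemble of 5 models predicting a scalar task outcome for 3 scenarios.
rng = np.random.default_rng(0)
means = np.array([[0.9, 0.4, 0.1]]) + 0.05 * rng.standard_normal((5, 3))
variances = np.array([[0.01, 0.04, 0.10]]) * np.ones((5, 3))
aleatoric, epistemic, total = decompose_uncertainty(means, variances)
print("aleatoric:", aleatoric, "\nepistemic:", epistemic)
# High total uncertainty would flag low competency for attempting that task.
```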